The invention relates to a method for character recognition
comprising the steps of: detecting a union of characters,
preprocessing the union of characters, comparing the preprocessed
union of characters with one or more template symbols, and applying
a decision rule in order to either reject a template symbol or
decide that the template symbol is included in the union of
characters. The step of preprocessing the union of characters
comprises the steps of: representing the union of characters as one
or more curves, parameterising said curve or curves, and, regarding
various classes of transformation, forming one or more shapes for
said curve or curves. The step of comparing comprises the steps of:
forming one or more geometric proximity measures, and determining
for every shape the values of said geometric proximity measures
between said shape and correspondingly determined shapes for the
template symbols. Finally, the step of applying a decision rule
comprises the step of: selecting one or more template symbols in
consideration of said values.

CHARACTER RECOGNITION



Technical Field 

The present invention relates to a method for character recognition
according to the preamble to claim 1. "Character" is in this
compound neutral regarding number, i.e. separate characters, such
as letters and numerals, as well as compositions of several
characters, such as words, are here referred to. Both generally
used characters and imaginary characters are, of course, included.

Background Art 

There are several known methods for character recognition,
especially for recognition of handwritten characters, which
requires especially good interpretation of the character. Several
of the known methods are based on the detection of each stroke of
the pen when a handwritten character is being formed. Geometric
characteristics, such as directions, inclinations and angles of
each stroke or part of a stroke, are determined and compared to
corresponding data for stored, known characters. The written
character is taken to be the stored character whose geometric
characteristics best correspond to the geometric characteristics of
the written character. The geometric characteristics are related to
an xy-coordinate system, which covers the used writing surface.
Such known methods are disclosed in, for instance, US-5,481,625 and
US-5,710,916. A problem in such methods is that they are sensitive
to rotation. For example, if one writes diagonally over the writing
surface, the method has difficulties in correctly determining what
characters are being written.

US-5,537,489 discloses a method for preprocessing the characters by
normalising them. The written character is sampled, and each sample
is represented as a pair of coordinates. Instead of solely
comparing the characters in the coordinate plane, the
transformation is determined which best adjusts the written
character to a model character. Indirectly, also rotation and
certain types of deformations, which the above-mentioned methods
cannot handle, are thus taken into account. The transformation is
used to normalise the written character. In particular, the
character is normalised by being translated so that its central
point is in the origin of coordinates, where also the central point
of the model character is found, after which the character is
scaled and rotated in such a manner that it corresponds to the
model character in the best possible way.

A disadvantage of this method is that the normalisation requires
computing power and that in any case the choice of model characters
has to take place by determining what model character the written
character resembles the most.

Another method which certainly can handle rotations is disclosed in
US-5,768,420. In this known method, curve recognition is described
by means of a ratio that is named "ratio of tangents". A curve, for
instance a portion of a character, is mapped by selecting a
sequence of pairs of points along the curve, where the tangents in
the two points of each pair intersect at a certain angle. The ratio
between the distances from the intersection point to the respective
points of the pair is calculated and makes up an identification of
the curve. This method is in principle not sensitive to
translation, scaling and rotation. However, it is limited in many
respects. Above all, it does not allow certain curve shapes in
which there are not two points whose tangents intersect at the
determined angle. It is common that at least portions of a
character comprise such indeterminable curve shapes for a selected
intersection angle. This reduces the reliability of the method.

Summary of the Invention

An object of the invention is to provide a method for character
recognition, which does not have the above-mentioned disadvantages,
and which to a larger extent accepts individual styles of
handwritten characters and unusual fonts of typewritten characters,
and is easy to implement with limited computing power.

The object is achieved by a method according to the invention as
defined in claim 1.

According to the invention, the term "template symbol" means, as
defined in the claim, everything from a portion of a separate
character, the portion being, for instance, an arc or a partial
stroke and the character being a letter or a numeral, to compound
words or other complex characters. In a similar way, the term
"union of characters" means everything from a separate character to
compositions of several characters. The extension of the mentioned
terms will be evident from the following description of
embodiments.

Brief Description of the Drawings 

The invention and further advantages thereof will be described in
more detail below by way of embodiments referring to the
accompanying drawings, in which

Fig. 1 shows an example of a union of characters which comprises a
handwritten character, and which illustrates some steps in a
preferred embodiment of the method according to the invention.

Figs 2 and 3 show examples of various transformations of a union of
characters which comprises a handwritten character.

Fig. 4 shows an example of recognition of a union of characters
which comprises several characters, and

Fig. 5 shows an embodiment of a device for carrying out the method.

Description of Embodiments

According to the invention, the method for character recognition
comprises a number of main steps:
a) a union of characters is detected,
b) the union of characters is preprocessed,
c) the preprocessed union of characters is compared with one or
more template symbols, and
d) a decision rule is applied in order to determine whether or not
any one of the template symbols is included in the union of
characters.

According to a preferred embodiment, the various main steps are
carried out in accordance with the following description. The
embodiment is preferably intended for recognising unions of
characters that are written on a pressure-sensitive display, which
is available on the market. It should be noted that the invention
is just as useful for recognising typewritten as handwritten unions
of characters that originate from a hard copy, which for instance
is scanned into a computer. An embodiment which is particularly
adapted to recognition of typewritten, scanned unions of characters
will be described below. In the following description of the steps
of this embodiment, it will for the sake of simplicity be presumed
that the union of characters comprises one character.

In step a), points on the character are detected at regular time
intervals at the same time as the character is being written on the
pressure-sensitive display. Thus, an ordered sequence of points is
obtained. In step b), the following operations are carried out. By
interpolation between the points, a curve representation of the
character is generated. The curve representation comprises one or
more curves which pass through the sequence of points. Any lifting
of the pen is detected to prevent the interpolation from extending
over spaces between points where the pen has been lifted. The
interpolation results in characters such as "t", "a" and "s" being
considered to consist of one or more curves. Each curve or
composition of curves is perceived holistically as an indivisible
geometric unit. This means, for instance, that the method according
to the invention in many ways operates on complete characters
(global character interpretation). Each point is represented as two
coordinates, which indicate the position of the point in the
limited plane that the display constitutes. One of the coordinates,
which in the following will be called $x_1$, indicates the position
laterally and the second, which will be named $x_2$ below,
indicates the position in the vertical direction. The curve is
conveniently parameterised as $\phi(t) = (\phi_1(t), \phi_2(t))$,
$a \le t \le b$, where, for the sake of simplicity, $a = 0$ and
$b = 1$, and is sampled in a number $n$ of points
$t_1 < t_2 < \dots < t_n$ according to any suitable
parameterisation rule. To begin with, arc length is the rule
according to which the parameterisation is preferably carried out,
which means that the points become equidistantly located. It is to
be noted that because of the irregular speed of motion of the
writer, this is not the case with the initial coordinate samples.
The use of the arc length can be seen as a standardisation of the
parameterisation, which facilitates the following comparison with
template symbols, which are parameterised and sampled in a
corresponding manner. For some classes of transformation it may be
necessary to reparameterise, which will also be described below.
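
The arc-length standardisation can be illustrated with a minimal
sketch. It assumes that the interpolation between detected points
is piecewise linear, which the text does not prescribe, and the
function name resample_by_arc_length is hypothetical.

```python
import math

def resample_by_arc_length(points, n):
    """Return n points equally spaced along the polyline through `points`.

    Assumption: straight-line interpolation between the detected points.
    """
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at each detected point.
    cumulative = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cumulative[-1]
    if total == 0.0:
        raise ValueError("curve has zero length")
    resampled = []
    segment = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target arc length.
        while segment < len(points) - 2 and cumulative[segment + 1] < target:
            segment += 1
        span = cumulative[segment + 1] - cumulative[segment]
        u = 0.0 if span == 0.0 else (target - cumulative[segment]) / span
        (x0, y0), (x1, y1) = points[segment], points[segment + 1]
        resampled.append((x0 + u * (x1 - x0), y0 + u * (y1 - y0)))
    return resampled
```

Samples taken at regular time intervals bunch together where the
pen moves slowly; after this step the spacing depends only on the
geometry of the curve.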

In order to compare the character with template symbols it is
necessary to form a representation which allows quantitative
comparisons. Some deviations from a template symbol defined in
advance have to be allowed, i.e. for instance an "a" has to be
interpreted as an "a" even if, with respect to its shape, it
differs to a certain extent from the template symbol. According to
the invention, a definition is applied that is based on different
transformations. Depending on demands for flexibility and
exactness, various classes of transformation may be allowed, the
classes comprising one or more types of transformation such as
translation, rotation, scaling, shearing and reflection. This is
illustrated in Figs 2 and 3. Fig. 2 shows a handwritten "a" in box
2a. The other three characters have been subjected to various
affine transformations. The class of transformation which is
comprised by the affine transformations allows rotation, shearing,
reflection, scaling and translation. The characters in boxes 2b and
2c have been subjected to translation, rotation, scaling and
shearing in relation to the character in box 2a. The character in
box 2d has been subjected to translation, reflection, rotation and
scaling.

Fig. 3 illustrates positive similarity transformations that only
comprise scaling, rotation and translation. In accordance with this
embodiment of the method according to the invention, permissible
deviations are limited to positive similarity transformations. This
means that a written character or part of a character, which by a
suitable combination of scaling, rotation and translation can be
brought into correspondence with a template symbol, is interpreted
as the same character or part of the character which is represented
by the template symbol. The correspondence does not have to be
complete, which will be described below.
The representation, which according to this invention is to be
preferred, is provided by forming an invariant of the parameterised
curve. Useful invariants should according to the invention allow an
interpretation that is close to the interpretation a human being
makes of a particular character. This means that characters which a
human being is able to interpret correctly with great accuracy,
i.e. interpret as the characters which the writer says that he or
she has written, should be interpreted correctly and with great
accuracy by the method according to the invention. It is thus
important that a constructed invariant is selective in a
well-balanced way. According to the invention, invariants are
therefore constructed on the basis of the following definition. If
$\phi$ is a parameterised curve according to the above, and $G$ is
a group of transformations of curves, then the union
$d(\phi) = \{\psi \mid \psi = g(\phi),\ g \in G\}$ and equivalent
rewritings thereof are called the shape of $\phi$. It will be
appreciated by those skilled in the art that the definition allows
many possible invariants, which, however, all have in common that
they handle the curve as the above-mentioned indivisible unit.

According to the preferred embodiment of the invention, the shape
corresponding to the group of positive similarity transformations
is given by
$s(\phi) = \mathrm{linhull}(\{(\phi_1, \phi_2), (-\phi_2, \phi_1), (1, 0), (0, 1)\})$,
i.e. a linear space constructed from the parameterised curve
$\phi$. As will be appreciated by those skilled in the art,
$s(\phi)$ is precisely an equivalent rewriting of $d(\phi)$. In
practice, the use of this shape implies that all parameterised
curves, which can be transferred into each other by positive
similarity transformations, have the same linear space as shape.

By contrast, according to another embodiment of the invention,
affine transformations are permissible. Then the shape, after
rewriting, is given by
$s(\phi) = \mathrm{linhull}(\{\phi_1, \phi_2, 1\})$, which is
described in more detail in, for instance, R. Berthilsson,
"Extension of affine shape", Technical report, Dept. of
Mathematics, Lund Institute of Technology, 1997.

In step c), the shape of the written character is compared with
correspondingly formed shapes for a number of template symbols. In
this embodiment of the invention, the template symbols are
initially provided by letting a user write by hand on the display
all the characters that he or she might need, one at a time, which
are processed in accordance with the above-described steps a) and
b) and are stored as template symbols. As mentioned above, each
template symbol comprises one or more curves, which represent a
portion of a character or the complete character, which in practice
means that several template symbols may be required to build a
character. However, as will be further developed below, a template
symbol can, conversely, also represent a sequence of several
characters.
According to the invention, one way to compare the shapes is to use
a geometric measure of proximity. For the above formed shapes
according to the preferred embodiment and the alternative
embodiment, respectively, a geometric proximity measure $\mu$ for
shapes, which comprise linear sub-spaces within the space of
possible parameterised curves $S$, may be used. An example of such
a geometric proximity measure is:

[equation not reproduced]

In the definition, $s(\phi)$ and $s(\psi)$ represent such linear
sub-spaces. $P_{s(\phi)}$ and $P_{s(\psi)}$ further represent
orthogonal projections onto $s(\phi)$ and $s(\psi)$, respectively.
HS represents the Hilbert-Schmidt norm and $I$ is the identity. The
calculation of the geometric proximity measure $\mu$ includes
selecting a scalar product.

A general example of a scalar product of two functions $\phi(t)$
and $\psi(t)$ with values in $\mathbb{R}^n$ is:

[equation not reproduced]

where $dm_k$ are positive Radon measures and $\cdot$ represents the
scalar product on $\mathbb{R}^n$.

Since each sampled curve comprises a plurality of points, each with
two coordinates, it is convenient to use matrix notation for
comparative processing of the shapes. The steps of describing the
curves in matrix notation and constructing a geometric proximity
measure can be described and carried out mathematically as follows.

Let us name the curve of the detected character
$\psi(t) = (\psi_1(t), \psi_2(t))$, $0 \le t \le 1$, and the curve
of a template symbol $\phi(t) = (\phi_1(t), \phi_2(t))$,
$0 \le t \le 1$. By sampling the curve at the points of time
$0 = t_1 < t_2 < \dots < t_n = 1$, the following matrices may be
formed:

[matrices $M_1$ and $M_2$ not reproduced]

The matrices are QR-factorised in a manner known to those skilled
in the art, such that $M_1 = Q_1 R_1$ and $M_2 = Q_2 R_2$, where
$Q_1$ and $Q_2$ are orthogonal matrices and $R_1$ and $R_2$ are
upper triangular. The matrices $Q_1$ and $Q_2$ represent the shapes
of the detected character and the template symbol, respectively,
given the parameterisations and the sampling.

A geometric proximity measure $\mu$ may be constructed as follows:

[equation not reproduced]

where the norm $\|\cdot\|_F$ denotes the Frobenius norm. When
$l = 0$ and $dm_0$ is the usual Lebesgue measure on the interval
$[0, 1]$, in the above general example of a scalar product, exactly
this geometric proximity measure is obtained. The choice of scalar
product affects the performance of the method.
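
The matrices and the closed form of the measure are not reproduced
in this text, so the following is only a sketch under stated
assumptions. It assumes that the columns of each matrix are the
sampled generators of the similarity shape, namely
$(\phi_1, \phi_2)$, $(-\phi_2, \phi_1)$, $(1, 0)$ and $(0, 1)$,
each stacked into a vector of length $2n$, and that the measure has
the form $\|(I - Q_1 Q_1^T) Q_2\|_F^2$, which is consistent with
the projections, the identity and the Frobenius norm named above
but is not confirmed by the source. Modified Gram-Schmidt is
swapped in for the QR factorisation; it yields an orthonormal basis
of the same column space. All function names are hypothetical.

```python
import math

def shape_columns(points):
    """Assumed generators of the similarity shape of a sampled curve."""
    xs = [float(p[0]) for p in points]
    ys = [float(p[1]) for p in points]
    n = len(points)
    return [
        xs + ys,                    # (phi1, phi2)
        [-y for y in ys] + xs,      # (-phi2, phi1): the curve turned 90 degrees
        [1.0] * n + [0.0] * n,      # (1, 0): translation along x1
        [0.0] * n + [1.0] * n,      # (0, 1): translation along x2
    ]

def orthonormal_basis(columns, tol=1e-12):
    """Modified Gram-Schmidt; spans the same space as Q in a thin QR."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            dot = sum(a * b for a, b in zip(q, v))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > tol:              # skip linearly dependent columns
            basis.append([a / norm for a in v])
    return basis

def proximity(points_a, points_b):
    """Assumed measure ||(I - Q1 Q1^T) Q2||_F^2 for equally long samplings."""
    q1 = orthonormal_basis(shape_columns(points_a))
    q2 = orthonormal_basis(shape_columns(points_b))
    # ||(I - Q1 Q1^T) Q2||_F^2 = (columns of Q2) - ||Q1^T Q2||_F^2
    overlap = sum(sum(a * b for a, b in zip(u, v)) ** 2 for u in q1 for v in q2)
    return len(q2) - overlap
```

Under these assumptions the value is zero, up to rounding, for two
samplings that differ only by scaling, rotation and translation,
and grows as the two linear spaces diverge.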

After the determination of the values of the proximity measure
between the shape of the detected character and the shapes of all
or a subset of the template symbols, step d) is carried out. In
this step, each value is compared with an individual acceptance
limit which is defined for each template symbol. The template
symbols whose values of the proximity measure are smaller than
their respective acceptance limits are considered plausible
interpretations of the written character. Of these plausible
interpretations, the template symbol is selected whose value is the
smallest. If, on the other hand, no value is smaller than its
acceptance limit, a refined determination is made. The acceptance
limits may also be one and the same for all of the template
symbols. An advantage of using individual acceptance limits is that
more complicated characters, such as "@", tend to have a fairly
high value of the proximity measure also in case of correspondence,
while simpler characters, such as "1", generally have a low value
of the proximity measure in case of correspondence. Further
variants are possible, some of which will be described below.
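
The decision rule of step d) can be sketched as follows. The
dictionaries values and limits, keyed by template symbol, and the
function name decide are assumptions made for illustration.

```python
def decide(values, limits):
    """Return the accepted template symbol with the smallest value, or None.

    values: proximity value per template symbol
    limits: individual acceptance limit per template symbol
    """
    plausible = {name: v for name, v in values.items() if v < limits[name]}
    if not plausible:
        return None  # the caller proceeds to the refined determination
    return min(plausible, key=plausible.get)
```

A looser limit for "@" than for simpler symbols lets a complicated
character be accepted despite a relatively high value, as described
above.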

Theoretically, the proximity measure has to fulfil
$\mu(s(\phi), s(\psi)) = 0$ when $\phi$ and $\psi$ are
parameterisations of the same curve, the curves being regarded as
the same when they are obtained from each other with a positive
similarity transformation. Since people when writing do not exactly
stick to the permissible similarity transformations of the template
symbols, the acceptance limits should, however, be selected to be
greater than zero.

On the one hand, the acceptance limits are therefore determined to
be values which are > 0, and on the other hand the case where no
value is smaller than its acceptance limit is not interpreted as if
the written character does not have an equivalent among the
template symbols. Instead, according to this embodiment a
reparameterisation is carried out, since the parameterisation
affects the final result to a fairly large extent. A preferred
reparameterisation of the curve $\psi$ means that it is composed
with a one-to-one function $\gamma: [0, 1] \to [0, 1]$. For
instance, $\gamma(t) = 1 - t$ fulfils this, which means that the
character is written in the opposite direction. What sort of
reparameterisation has to be done is determined by solving the
minimisation problem

$\min_{\gamma} \mu(s(\phi), s(\psi \circ \gamma))$

where the minimisation is performed over all of the $\gamma$ which
have been described above.
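
A sketch of the reparameterisation search is given below. The text
minimises over all admissible one-to-one functions; as a
simplifying assumption, the sketch searches only a finite list of
candidates supplied by the caller, evaluates the composed curve by
linear interpolation between uniformly spaced samples, and takes
the proximity measure as a parameter. The names compose and
best_reparameterisation are hypothetical.

```python
def compose(samples, gamma):
    """Sample the reparameterised curve at t_k = k / (n - 1).

    Assumption: `samples` are taken at uniformly spaced parameter values
    and the curve is interpolated linearly between them.
    """
    n = len(samples)
    out = []
    for k in range(n):
        s = gamma(k / (n - 1)) * (n - 1)   # position in sample-index units
        i = min(int(s), n - 2)
        u = s - i
        (x0, y0), (x1, y1) = samples[i], samples[i + 1]
        out.append((x0 + u * (x1 - x0), y0 + u * (y1 - y0)))
    return out

def best_reparameterisation(phi, psi, proximity, candidates):
    """Minimise proximity(phi, psi composed with gamma) over named candidates.

    candidates: list of (name, gamma) pairs; returns (value, name).
    """
    scored = [(proximity(phi, compose(psi, g)), name) for name, g in candidates]
    return min(scored)
```

With the two candidates $\gamma(t) = t$ and $\gamma(t) = 1 - t$,
the search detects a character that was written in the opposite
direction.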

The above-described steps are then repeated and new values of the
proximity measure are obtained. If none of these is below its
acceptance limit, the written character is rejected and the user is
informed about this, for instance by requesting him or her to
rewrite the character. If one wishes to speed up the determination
of the proximity measure after the reparameterisation, a group
consisting of the template symbols with the smallest, for example
the three smallest, values of the proximity measure from the first
determination can be selected, and in the second determination the
written character is compared only with the template symbols that
are included in the group. However, in certain cases this may
produce a final result other than in the case where all the
template symbols are taken into consideration in the second
determination.
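
The shortlist used to speed up the second determination can be
sketched with the standard library. The dictionary of first-pass
values and the function name shortlist are assumptions.

```python
import heapq

def shortlist(values, size=3):
    """Names of the `size` template symbols with the smallest first-pass values."""
    pairs = ((v, name) for name, v in values.items())
    return [name for _, name in heapq.nsmallest(size, pairs)]
```

Only the symbols returned here would be compared again after the
reparameterisation, at the risk noted above of missing a symbol
that ranked lower in the first determination.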

The geometric proximity measure $\mu$ does not only result in a
ranking order between different interpretations of a character, but
it also gives a measure of how similar two characters are. This
yields the possibility of also using the present method for
verification and identification, respectively, of signatures
(initials are here perceived as signatures). In this use, the
arc-length parameterisation is, however, not a preferred type of
parameterisation since it excludes information about the dynamics
when writing. Such information is valuable in this use. There are,
however, other variants that are more suitable.

WO 00/13131    PCT/SE99/01448

The preferred embodiment has hitherto been described on the basis
of the fact that there are suitable template symbols with which the
written character can be compared. Furthermore, the description has
been made for one character. Normally, it is not separate
characters, but running text with complete words that are written
on the display. From the user's point of view, it is desirable to
be able to write running text, which demands much of the method.

A problem in the context is that the union of characters may
contain a plurality of character combinations. It is unreasonable
to ask the user to write all possible characters or words as
template symbols.

At the same time, it is advantageous if a limitation of the shapes
of the writing can be avoided. If the user were strictly limited,
for instance only allowed to write one character at a time so that
the above-described case always exists, the situation would be
relatively clear, but not user-friendly. According to the
invention, the user is allowed to write running text. It is thus
difficult to know where in the curve/curves, for instance, a
character ends and starts. The points indicating the beginning and
the end of a character are named breakpoints, and finding possible
breakpoints adds complexity to the problem of recognition. This
problem of complexity is solved in accordance with an embodiment of
the method according to the invention in the following manner. It
should be mentioned that the above steps are carried out in the
same way in this embodiment. The following description essentially
concerns the step of preprocessing the union of characters and the
step of comparing.

If the pen is lifted after each character in a word, this may be
taken advantage of. Each lifting of the pen gives rise to a
discontinuity and may be detected by two points being relatively
far apart in space or time. Naturally, this detection is carried
out before the arc-length parameterisation. The union of characters
here consists of $n$ curves. The points of discontinuity may be
taken as plausible breakpoints to distinguish two characters from
one another. This brings into focus the problem of characters
containing several strokes that are written by lifting the pen in
between. Such a character will be represented by several curves by
means of the detection of discontinuity. However, each curve may be
parameterised with rescaled arc length, which means that each curve
contains the same number of sampling points. Assume that
$l_1, l_2, \dots, l_n$ are the curves and that $s_k$ is a
composition of the curves 1 to $k$. Compare the compositions of
curves $s_1, s_2, \dots, s_k$ with the database of template
symbols, where $k$ is the largest number of curves included in any
template symbol. Assume that $s_{k_1}$ is the longest composition
of curves which gives correspondence/correspondences, i.e. which,
when comparing with template symbols, gives one or more values of
proximity measures that are below the acceptance limit/acceptance
limits. Even if $s_{k_1}$ corresponds to one or more template
symbols, it is not certain that this gives a correct
interpretation. In accordance with this embodiment of the method, a
plausibility test is therefore carried out, which will be described
below. If the interpretation is not plausible, $s_{k_1}$ is
shortened to the longest composition of curves $s_{k_2}$, shorter
than $s_{k_1}$, which gives correspondence. The plausibility test
is carried out once more.
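
The pen-lift detection mentioned at the start of this passage can
be sketched as a split of the raw samples wherever two consecutive
points are far apart in space or time. The sample layout (x, y, t),
the two thresholds and the function name split_strokes are
assumptions; the text gives no numerical limits.

```python
import math

def split_strokes(samples, max_gap, max_pause):
    """Split (x, y, t) samples into curves at large jumps in space or time."""
    strokes = []
    current = []
    for sample in samples:
        if current:
            px, py, pt = current[-1]
            x, y, t = sample
            # A discontinuity marks a lifting of the pen.
            if math.hypot(x - px, y - py) > max_gap or t - pt > max_pause:
                strokes.append(current)
                current = []
        current.append(sample)
    if current:
        strokes.append(current)
    return strokes
```

The split is applied to the raw samples, before any arc-length
parameterisation, in line with the order given in the text.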

If no interpretation is plausible for any $s_k$, the best
interpretation of $s_1$ is selected. The remaining curves are
processed correspondingly. The points of discontinuity alone are
not sufficient as plausible breakpoints as far as coherent writing
is concerned; there may also be breakpoints within a curve. It is
to be noted that as a matter of fact the above procedure to find
breakpoints is achieved with reparameterisations of the composition
of all written curves.
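
The search over compositions of curves can be sketched as a greedy
loop that tries the longest composition first and shortens it until
a plausible correspondence is found. The callables match and
plausible stand in for the comparison with the template database
and for the plausibility test; both, like the function name
segment, are hypothetical.

```python
def segment(curves, match, plausible, max_len):
    """Interpret a list of curves as a sequence of template symbols.

    match(composition): template symbol name, or None if no value is
        below its acceptance limit.
    plausible(composition, name): result of the plausibility test.
    max_len: largest number of curves in any template symbol.
    """
    result = []
    start = 0
    while start < len(curves):
        chosen = None
        longest = min(max_len, len(curves) - start)
        # Try the longest composition first, then shorten it.
        for k in range(longest, 0, -1):
            composition = curves[start:start + k]
            name = match(composition)
            if name is not None and plausible(composition, name):
                chosen = (name, k)
                break
        if chosen is None:
            # Fallback from the text: take the interpretation of s_1.
            # In this sketch that may be None if nothing matches at all.
            chosen = (match(curves[start:start + 1]), 1)
        result.append(chosen[0])
        start += chosen[1]
    return result
```

The sketch treats only pen lifts as breakpoints; breakpoints within
a curve, which the text notes are needed for coherent writing, are
outside its scope.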

The term "plausibility tests" covers, inter alia, so-called
confidence sets. The above reasoning regarding the recognition of
unions of characters consisting of several characters, and
characters consisting of several curves, respectively, will now be
exemplified by means of Fig. 4, the confidence sets being used as
plausibility tests.

Assume that the written character is "ata" (English "eat"), i.e. a
complete word written in accordance with Fig. 4a. By means of
detection of discontinuities and reparameterisation with rescaled
arc length, "a" has been identified and "t" is the next in turn.
The horizontal as well as the vertical stroke can be interpreted as
a "1", i.e. "t" can be interpreted as "11". The template symbols
are stored with associated confidence sets according to Fig. 4b,
where the template symbols "1" and "t" are shown with the
respective confidence sets as the shaded area. Assume that the
vertical stroke of "t" is interpreted as the template symbol "1".
The transformation $a: \mathbb{R}^2 \to \mathbb{R}^2$ may then be
determined, within the class that generates the shape, which
transfers the template symbol into the vertical stroke. If $a$ is
applied to the confidence set, the result of Fig. 4c is achieved.
The next curve, i.e. the horizontal stroke, is in the confidence
set, which is forbidden, and the interpretation is classified as
implausible. The confidence sets do not need to be identified by
only straight strokes, as those skilled in the art will realise,
but may have a more general appearance. To each template symbol
another confidence set can be connected which contains the first
set. If then the next curve is also outside the second confidence
set, it will be interpreted as if the next character is the first
one in a new word.
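
The confidence-set test can be sketched as follows, assuming that
the transformation is a positive similarity written in complex form
as z -> a*z + b and that the confidence set is an axis-aligned
rectangle in template coordinates. Both are illustrative
assumptions, since the text allows sets of a more general
appearance, and the function names are hypothetical. Instead of
transforming the set, the sketch maps each point of the next curve
back into template coordinates, which is equivalent.

```python
def in_transformed_set(point, rect, a, b):
    """True if `point` lies in the image of `rect` under z -> a*z + b.

    rect: (x_min, y_min, x_max, y_max) in template coordinates
    a, b: complex numbers, a != 0 (scaling and rotation, translation)
    """
    z = (complex(*point) - b) / a   # pull the point back to template coordinates
    x_min, y_min, x_max, y_max = rect
    return x_min <= z.real <= x_max and y_min <= z.imag <= y_max

def interpretation_is_plausible(next_curve, rect, a, b):
    """Implausible as soon as any point of the next curve enters the set."""
    return not any(in_transformed_set(p, rect, a, b) for p in next_curve)
```

In the example of Fig. 4, the horizontal stroke of "t" would fall
inside the transformed set of the template symbol "1", so that
interpretation would be rejected.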

An alternative plausibility test means that the transformation
which was determined in the description of confidence sets is
studied. If the transformation is beyond a certain scope, the
interpretation will be classified as implausible. Such scope may,
for instance, determine how much the transformation is allowed to
turn the character in relation to how much earlier interpreted
characters have been turned. Also excessive deformations may be
excluded. In order to distinguish, for example, "S" from "s", the
enlargement of the transformation can be calculated in relation to
the enlargement of symbols that have been interpreted before.
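
This alternative test can be sketched by fitting a positive
similarity z -> a*z + b to paired template and written points by
least squares and then comparing its rotation and enlargement with
those of an earlier accepted transformation. The least-squares fit,
the thresholds and the function names are assumptions made for
illustration; the text does not state how the transformation is
determined or what scope is allowed.

```python
import cmath

def fit_similarity(template, written):
    """Least-squares a, b in z = a*w + b for paired points.

    template, written: equally long lists of (x, y); w are the template
    points and z the written points, both read as complex numbers.
    """
    w = [complex(*p) for p in template]
    z = [complex(*p) for p in written]
    w_mean = sum(w) / len(w)
    z_mean = sum(z) / len(z)
    num = sum((wk - w_mean).conjugate() * (zk - z_mean) for wk, zk in zip(w, z))
    den = sum(abs(wk - w_mean) ** 2 for wk in w)
    a = num / den
    return a, z_mean - a * w_mean

def transform_is_plausible(a, reference_a, max_turn, max_scale_ratio):
    """Compare rotation and enlargement with an earlier transformation."""
    turn = abs(cmath.phase(a / reference_a))      # relative rotation in radians
    ratio = abs(a) / abs(reference_a)             # relative enlargement
    return turn <= max_turn and 1 / max_scale_ratio <= ratio <= max_scale_ratio
```

A relative enlargement far from 1 is the kind of evidence that
could separate "S" from "s" when the two shapes are otherwise the
same.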

The above-described embodiments of the method according to the
invention should only be seen as non-limiting examples, and many
modifications apart from the above-mentioned ones are possible
within the scope of the invention as defined in the appended
claims. Examples of further such modifications follow below.
As an alternative to the above-described reparameterisation, the
decision is taken directly on the basis of the first determined
smallest value of proximity measure.

Examples of other modifications are the choice of another proximity
measure, various choices of values of acceptance limits that demand
a certain adaptation to various users, different types of
reparameterisation and different types of shape, for instance, an
affine shape.
As far as various types of shape are concerned, two or more shapes
are, as an alternative, used in parallel for each union of
characters. This means that several invariants are provided for
each union of characters and are then processed in parallel in the
following steps. This gives a higher degree of accuracy and a
faster recognition.

In practice, the method according to the invention 
can be used, for instance, in electronic notebooks and 
the like and in mobile telephones with an enhanced possi- 
bility of communication by a writable window. 

The method according to the invention can be implemented as a
computer program in a computer by using a commercially available
programming language for mathematical calculations, such as C, C++
or FORTRAN, or as a specially built device according to the
invention, which will be described below. In both cases, the
template symbols are stored as a database. If needed, the database
can be changed.

As shown in Fig. 5, an embodiment of a device, which realises the
method, comprises a pressure-sensitive display 42, a display
communication unit 44 with a detector 46, a control unit 48, a
memory control unit 50, a memory unit 52 and a processing unit 54.
The display communication unit 44, the control unit 48, the memory
control unit 50 and the processing unit 54 communicate via a bus 56
transferring data, address and control signals between the units.
Unions of characters are written on the display 42 and are detected
by the detector 46, which provides the ordered sequence of points.
In the memory unit 52, the template symbols and the detected unions
of characters are stored. By means of the processing unit 54,
calculation operations are carried out, which comprise the
interpretation of the sequences of points as one or more curves,
the parameterisation of each curve, the comparison of the
preprocessed union of characters with template symbols and the
application of the decision rule. In the memory unit 52, also
software for carrying out the method is stored. The control unit 48
runs the program and communicates with the user via the display
communication unit 44 and the display 42.

The device is also adapted for optional settings which, inter alia,
may comprise the choice of shapes, the choice of proximity measure,
the choice of parameterisations and the choice of decision rule.
The choices are made via the display 42.

Above, the description has essentially been made on the basis of the
characters being written on a display and being detected at the same
time as they are written. An alternative is that the characters are
detected, for instance scanned, when they have already been written
on a piece of paper. This concerns handwritten characters as well as
typewritten ones. Thus, the detection comprises, instead of the
operation of recognising the display writing, the operation of
reading (scanning) the characters on the piece of paper.
Advantageously, read data is transformed into said ordered sequence
of points by edge detection. However, it is also a modification
within the scope of the invention. In this embodiment, the
preprocessing comprises forming one or more characteristic curves,
for instance the edge curve or edge curves of the character, on the
basis of said edge detection and parameterisation. When the edge
curves thus have been defined, the following steps are the same as
in the above-described, preferred embodiment.

The decision rule may be selected in many different ways. A variant
of the above-mentioned is that all the template symbols for which
the value of the proximity measure is below the acceptance limit are
selected. Subsequently, the template symbols may be processed
further in accordance with any refined determination of the
above-described type. It is also possible to make a combination with
another selection method, which points out the most plausible
alternative. One example of such a method is statistics of
characters that indicate the probability of the presence of separate
characters or compositions of characters in texts.
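
As a minimal sketch of such a decision rule, the selection of all
template symbols below their acceptance limits may be combined with
character statistics as shown below. The function names, the
dictionary layout and all numerical values are hypothetical and serve
only as an illustration; the statistics are assumed to be relative
frequencies of the characters in texts.

```python
from typing import Dict, List, Optional


def select_below_limit(proximity: Dict[str, float],
                       limits: Dict[str, float]) -> List[str]:
    """Select every template symbol whose proximity value is below its limit."""
    return [symbol for symbol, value in proximity.items()
            if value < limits[symbol]]


def most_plausible(candidates: List[str],
                   frequency: Dict[str, float]) -> Optional[str]:
    """Among the selected symbols, point out the one most frequent in texts.

    Returns None when no template symbol was selected.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda symbol: frequency.get(symbol, 0.0))


# Illustrative values only (not taken from the description).
proximity = {"a": 0.12, "o": 0.15, "u": 0.40}
limits = {"a": 0.20, "o": 0.20, "u": 0.20}
frequency = {"a": 0.08, "o": 0.07, "u": 0.03}

candidates = select_below_limit(proximity, limits)
best = most_plausible(candidates, frequency)
```

With these example values, "a" and "o" pass the acceptance limit and
the statistics decide between them; a refined determination of the
above-described type could equally be applied to the candidates
before or instead of the statistical step.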

Moreover, an alternative for determining the accep- 
tance limits is that the template symbols are grouped, in 
which case the same limit applies within a group. 
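
The grouping of template symbols can be illustrated by a short
sketch in which one acceptance limit is stored per group and expanded
to the individual template symbols. The group names, the function
name expand_group_limits and the limit values are assumptions made
for the sake of the example.

```python
from typing import Dict, Iterable


def expand_group_limits(groups: Dict[str, Iterable[str]],
                        group_limit: Dict[str, float]) -> Dict[str, float]:
    """Give every template symbol the acceptance limit of its group."""
    limits: Dict[str, float] = {}
    for name, symbols in groups.items():
        for symbol in symbols:
            limits[symbol] = group_limit[name]
    return limits


# Hypothetical grouping: one limit per group instead of one per symbol.
groups = {"round": ["o", "a"], "tall": ["l", "t"]}
group_limit = {"round": 0.20, "tall": 0.35}
limits = expand_group_limits(groups, group_limit)
```

The resulting table has one entry per template symbol, so the same
lookup can be used whether the limits were assigned individually or
per group.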

The method according to the invention is reliable in that it is able
to recognise rather deformed characters and manages running text.
The contents of the database are not crucial, but in principle a set
of separate characters is sufficient. In order to recognise a
variety of fonts and handwritings with a high degree of accuracy, it
may, however, be an advantage to store several variants of each
character, which comprise deformations that are outside the class of
transformation which is appropriate and permissible in the
comparison. It may also be advantageous to store certain
compositions of characters, for instance to be able to more safely
distinguish two l's, "ll", which are connected, from "u".

CLAIMS

1. A method for character recognition comprising the steps of:

- detecting a union of characters,

- preprocessing the union of characters,

- comparing the preprocessed union of characters with one or more
template symbols, and

- applying a decision rule in order to either reject a template
symbol or decide that the template symbol is included in the union
of characters, the step of preprocessing the union of characters
comprising the steps of:

- representing the union of characters as one or more curves, and

- parameterising said curve or curves, characterised in that the
step of preprocessing the union of characters further comprises the
step of forming, regarding various classes of transformation, one or
more shapes for said curve or curves, and that the step of comparing
comprises the steps of:

- forming one or more geometric proximity measures,

- determining for every shape the values of said geometric proximity
measures between said shape and correspondingly determined shapes
for the template symbols, and that the step of applying a decision
rule comprises the step of:

- selecting one or more template symbols in consideration of said
values.

2. A method as claimed in claim 1, characterised in that the step of
detecting a union of characters comprises the step of representing
the union of characters as a set of points, and that the step of
representing the union of characters as one or more curves comprises
the steps of:

- generating an ordered sequence of points from said set of points,
and

- interpolating between the points to generate said one or more
curves.

3. A method as claimed in claim 1 or 2, characterised in that the
step of parameterising comprises the steps of:

- arranging according to a convenient rule of parameterisation a
function which follows the curve, and

- sampling the function in a plurality of equidistant points.

4. A method as claimed in claim 3, characterised in that the rule of
parameterisation is an arc length.

5. A method as claimed in any one of the preceding claims, the union
of characters being detected on a display on which it is written
directly, characterised in that the step of detecting is carried out
during the writing.

6. A method as claimed in any one of claims 1-4, characterised in
that the union of characters is detected in a data quantity that
originates from a scanner.

7. A method as claimed in claim 6, characterised in that the step of
preprocessing the union of characters comprises edge detecting the
union of characters.

8. A method as claimed in any one of the preceding claims,
characterised in that the step of applying a decision rule comprises
determining acceptance limits of the values of said proximity
measures and selecting a template symbol only if at least one value
related to the template symbol is within said acceptance limits.

9. A method as claimed in claim 8, characterised in that individual
acceptance limits are assigned to each template symbol.

10. A method as claimed in claim 8, characterised in that at least
two template symbols have the same acceptance limits.

11. A method as claimed in any one of claims 8-10, characterised by
the step of reparameterising a parameterised curve if all the values
of said proximity measures between the shape of the parameterised
curve and the template symbols are beyond the acceptance limits, so
that one or more values of the corresponding proximity measure
decrease between the template symbols and the shape of the
reparameterised curve.

12. A method as claimed in any one of claims 8-11, characterised in
that the acceptance limits are determined on the basis of the fact
that only similarity transformations are permitted.

13. A method as claimed in any one of the preceding claims,
characterised in that the step of applying a decision rule comprises
carrying out a plausibility test of the selected template symbols.

14. A method as claimed in claim 13, characterised in that the
plausibility test is based on the confidence sets.

15. Use of the method as claimed in any one of the preceding claims
for verification or identification of signatures.



WO 00/13131    PCT/SE99/01448

[Drawings, sheets 1/3 to 3/3: Fig. 1; Figs. 2c and 2d; Figs. 3c and 3d; Figs. 4b and 4c; Fig. 5]



INTERNATIONAL SEARCH REPORT

International application No. PCT/SE 99/01448

A. CLASSIFICATION OF SUBJECT MATTER

IPC7: G06K 9/00, G06K 9/68

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)
IPC7: G06K

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched
SE, DK, FI, NO classes as above

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category*   Citation of document, with indication, where appropriate, of the relevant passages

X           US 5768420 A (M. BROWN ET AL.), 16 June 1998 (16.06.98),
            column 3, line 53 - column 4, line 17; column 5, line 14 - line 34;
            column 6, line 42 - line 58
            Relevant to claim No.: 1-10, 12-14

A           US 55598897 A (M.K. BROWN ET AL.), 24 Sept 1996 (24.09.96),
            column 10, line 28 - line 35
            Relevant to claim No.: 1-14

A           EP 0782090 A2 (AT & T CORP), 2 July 1997 (02.07.97), abstract
            Relevant to claim No.: 1-14



[X] Further documents are listed in the continuation of Box C.   [X] See patent family annex.

* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance: the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance: the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family



Date of the actual completion of the international search: 10 January 2000
Date of mailing of the international search report: 17.01.2000

Name and mailing address of the ISA/
Swedish Patent Office
Box 5055, S-102 42 STOCKHOLM
Facsimile No. +46 8 666 02 86

Authorized officer: Patrik Blidefalk/AE
Telephone No. +46 8 782 25 00

Form PCT/ISA/210 (second sheet) (July 1992)



C (Continuation). DOCUMENTS CONSIDERED TO BE RELEVANT

Category*   Citation of document, with indication, where appropriate, of the relevant passages

            WO 9404992 A1 (COMMUNICATION INTELLIGENCE CORPORATION),
            3 March 1994 (03.03.94), page 4, line 32 - page 5, line 35
            Relevant to claim No.: 1-14

            WO 9720286 A1 (MOTOROLA INC.), 5 June 1997 (05.06.97),
            page 15, line 5 - line 14
            Relevant to claim No.: 13-14



Information on patent family members (02/12/99)

Patent document          Publication   Patent family    Publication
cited in search report   date          member(s)        date

US 5768420 A             16/06/98      EP 0762313 A     12/03/97
                                       JP 9128485 A     16/05/97
                                       US 5559897 A     24/09/96
                                       US 5875256 A     23/02/99
                                       US 5878164 A     02/03/99
                                       CA 2136369 A     22/07/95
                                       EP 0664535 A     26/07/95
                                       JP 7219578 A     18/08/95
                                       US 5699456 A     16/12/97
                                       US 5719997 A     17/02/98
                                       US 5907634 A     25/05/99

US 55598897 A            24/09/96      NONE

EP 0782090 A2            02/07/97      CA 2190664 A     28/06/97
                                       JP 9288729 A     04/11/97
                                       US 5828772 A     27/10/98

WO 9404992 A1            03/03/94      US 5933514 A     03/08/99

WO 9720286 A1            05/06/97      AU 1124397 A     19/06/97
                                       US 5854855 A     29/12/98


